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Evolutionary conservation of the eumetazoan gene 
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Despite considerable differences in morphology and complexity of body plans among animals, a great part of the gene set 
is shared among Bilateria and their basally branching sister group, the Cnidaria. This suggests that the common ancestor 
of eumetazoans already had a highly complex gene repertoire. At present it is therefore unclear how morphological 
diversification is encoded in the genome. Here we address the possibility that differences in gene regulation could con- 
tribute to the large morphological divergence between cnidarians and bilaterians. To this end, we generated the first 
genome-wide map of gene regulatory elements in a nonbilaterian animal, the sea anemone Nematostella vectensis. Using 
chromatin immunoprecipitation followed by deep sequencing of five chromatin modifications and a transcriptional co- 
factor, we identified over 5000 enhancers in the Nematostella genome and could validate 75% of the tested enhancers in 
vivo. We found that in Nematostella, but not in yeast, enhancers are characterized by the same combination of histone 
modifications as in bilaterians, and these enhancers preferentially target developmental regulatory genes. Surprisingly, the 
distribution and abundance of gene regulatory elements relative to these genes are shared between Nematostella and 
bilaterian model organisms. Our results suggest that complex gene regulation originated at least 600 million yr ago, 
predating the common ancestor of eumetazoans. 



[Supplemental material is available for this article.] 

It is currently not known to what extent the evolution of mor- 
phological complexity was driven by changes in gene content or 
differences in gene regulation. The sea anemone Nematostella vec- 
tensis belongs to cnidarians, the sister group to all bilaterians, 
which diverged from bilaterians more than 600 million yr ago. 
Cnidarians consist of only two germ layers (endoderm and ecto- 
derm) and have roughly 12-20 morphologically distinct somatic 
cell types. Yet, the Nematostella gene repertoire is surprisingly 
similar to that of vertebrates (Putnam et al. 2007). Despite the 
simplicity of its morphology and of gene expression patterns 
(Steele et al. 2011; Technau and Steele 2011), the complexity of 
developmental gene families (e.g., Wnt, homeobox transcription 
factors) in Nematostella is similar to that found in vertebrates 
(Kusserow et al. 2005; Chourrout et al. 2006; Ryan et al. 2007), and 
its genome harbors many genes important for the development of 
key bilaterian traits such as mesoderm and bilaterality (Finnerty 
2004; Fritzenwanker et al. 2004; Martindale et al. 2004; Technau 
et al. 2005; Steele et al. 2011; Technau and Steele 2011). The simple 
body plan of an animal with such a complex gene repertoire led to 
the assumption that the complexity of gene regulation may be 
different between bilaterians and nonbilaterians and that these 
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differences might underlie apparent morphological differences. 
Indeed, the evolution of new body plans is often driven by changes 
in the regulation of gene expression (Carroll 2008; Wittkopp and 
Kalay 2011) and many differences between closely related species 
are caused by changes in ds-regulatory elements that regulate the 
expression patterns of their target genes in a modular way (Sucena 
et al. 2003; Jeong et al. 2008; Rebeiz et al. 2009; Chan et al. 2010; 
Frankel et al. 2012). 

In bilaterians, genes are regulated not only by proximal pro- 
moters, but also by a combination of distal ds-regulatory elements. 
Each gene can have multiple enhancer elements, which are often 
found at large distances from the gene they regulate (Spitz and 
Furlong 2012; Yahez-Cuna et al. 2012). Therefore, an important 
initial step in comparing gene regulation between evolutionary 
distant species is to identify enhancer elements throughout their 
genomes. Despite their importance, enhancers have been very 
difficult to locate until recently. However, with the advent of 
microarray and next-generation sequencing technologies, ge- 
nome-wide maps of enhancer elements have been predicted in 
human cell lines (Heintzman et al. 2007, 2009; The ENCODE 
Project Consortium 2012) and several major bilaterian model or- 
ganisms (Visel et al. 2009a; Gerstein et al. 2010; The modENCODE 
Consortium et al. 2010; May et al. 2011; Negre et al. 2011; Shen 
et al. 2012). This was achieved not only by mapping the binding 
sites of a number of tissue-specific transcription factors (Gerstein 
et al. 2010), but also based on characteristic histone modifications 
and the binding of transcriptional cofactors (Visel et al. 2009a; The 
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modENCODE Consortium et al. 2010; May et al. 2011; Negre et al. 
2011; Shen et al. 2012). Initially, it was shown in mammalian cell 
lines that while H3K4me3 levels are higher at promoters, 
H3K4mel is more enriched at enhancer elements (Heintzman 
et al. 2007, 2009; Supplemental Table SI). This observation has 
recently been extended to Drosophila as well as vertebrate model 
organisms (Negre et al. 2011; Bogdanovic et al. 2012; Bonn et al. 
2012). It has been suggested that inactive enhancers can be dis- 
tinguished from active ones due to the fact that active enhancers 
are also associated with acetylated histones (Creyghton et al. 2010; 
Rada-Iglesias et al. 2010). One of the enzymes that catalyzes the 
acetylation of histones, the transcriptional cofactor EP300 (also 
known as p300), has been shown to bind to enhancers, and has 
recently been used in several studies to predict enhancer elements 
genome wide (Visel et al. 2009b; May et al. 2011; Negre et al. 2011; 
Shen et al. 2012). Thus, while these studies underlined the com- 
plexity of ds-regulatory landscapes in vertebrates and insects, their 
evolutionary origin remains unclear, as no studies of nonbilaterian 
animals have been performed. 

Here, we profiled five histone modifications (Supplemental 
Table SI) and the binding sites of the Nematostella histone acety- 
lase p300 throughout the Nematostella genome. Our analysis 
revealed that the distribution of histone modifications across the 
Nematostella genome, relative to genes, as well as their colocaliza- 
tion with DNA methylation (Zemach et al. 2010), were remarkably 
similar to what has been shown in bilaterian model organisms. 
This allowed us to use this data to predict more than 5000 en- 
hancer elements across two developmental stages of Nematostella. 
A total of 75% of the predicted enhancers tested in vivo drove 
tissue-specific expression, suggesting that they represent impor- 
tant gene regulatory elements. Upon comparison with predicted 
enhancer maps in bilaterians, we could show that the genomic 
distribution and location relative to genes is shared between cni- 
darians and bilaterians. 

Results 

Distribution of histone modifications across Nematostella genes 

Using antibodies recognizing modified histone H3 tails, we 
established a chromatin immunoprecipitation (ChIP) protocol to 
determine the distribution of several "active" chromatin modifi- 
cations in Nematostella vectensis adult female polyps (whole ani- 
mals). ChlP-enriched and input DNA was subjected to deep se- 
quencing and the reads were aligned to the Nematostella genome 
(Putnam et al. 2007), resulting in highly reproducible data sets of 
histone H3 di- and trimethylation of Lysine 4, trimethylation of 
lysine 36, and acetylation of Lysine 27 (Supplemental Tables SI, 
S2). We calculated and compared the average enrichment of 
H3K4me3 relative to the transcription start sites (TSSs) of genes in 
Nematostella and several other model organisms. As expected, 
H3K4me3 showed the strongest enrichments slightly downstream 
from TSSs in all species, including Nematostella (Fig. 1A). This 
suggests that its distribution is shared among most eukaryotic ge- 
nomes, which was also the case for the other tested histone 
modifications (Liu et al. 2005; Barski et al. 2007; Rando and Chang 
2009; The modENCODE Consortium et al. 2010; Supplemental 
Fig. SI). However, we noted that while in Nematostella and Dro- 
sophila H3K4me3 peaks downstream from the TSS, mammalian 
genes show additional H3K4me3 just upstream of the TSS. This 
H3K4me3 enrichment just upstream of TSSs could stem from ad- 
ditional transcription in these regions (Supplemental Fig. 2). 



H3K4me3 and CpG methylation target mutually exclusive 
regions 

In vertebrates, DNA methylation at CpG dinucleotides is found 
throughout large parts of the genome, but is excluded from sites of 
H3K4 methylation (Meissner et al. 2008; Suzuki and Bird 2008; 
Jones 2012; Long et al. 2012). While the invertebrate model or- 
ganisms Drosophila melanogaster and Caenorhabditis elegans lack DNA 
methylation, it has recently been shown that other invertebrates, 
including Nematostella vectensis, methylate CpGs throughout many 
active gene bodies (Feng et al. 2010; Zemach et al. 2010). To test 
whether the mutually exclusive location of H3K4 methylation and 
DNA methylation is shared between vertebrates and cnidarians, we 
sorted all genes based on their distribution of H3K4me3 around the 
TSS (Fig. IB). While the distributions of H3K4me3, H3K4me2, and 
H3K27ac are very similar (Fig. IB; Supplemental Fig. SI), H3K36me3 
is enriched within gene bodies of the same genes, starting imme- 
diately after the TSS. CpG methylation localizes to gene bodies 
(Zemach et al. 2010) and its levels correlate with H3K36me3 (Fig. 
IB; Supplemental Fig. S3) (P < 2.2 X 10" 16 ). In contrast to 
H3K36me3, CpG methylation begins only about 1 kb after the TSS, 
a position where H3K4me3 is no longer enriched, thus suggesting 
a mutually exclusive location of H3K4me3 and CpG methylation 
in Nematostella. 

Enhancer chromatin modifications at promoter-distal p300 
peaks 

The transcriptional cofactor p300 has been shown to be associated 
with enhancer regions in vertebrates (Visel et al. 2009a; May et al. 
201 1). To find gene regulatory elements throughout the genome of 
Nematostella, we generated an antibody specifically recognizing 
the Nematostella ortholog of the p300/CBP protein, hereafter re- 
ferred to as p300 (Supplemental Fig. S4). We determined the ge- 
nome-wide binding sites of Nematostella p300 in gastrulae (data 
not shown) and planulae (Fig. 2). Gastrulation is an important 
point during development. Yet many developmental regulators 
only start to be expressed after gastrulation, during the develop- 
ment of the planula larva. We found 3354 (55%) and 6460 (60%) 
p300 peaks distal (>300 bp) from the nearest TSS in gastrulae and 
planulae, respectively (data not shown). This suggests that p300 
identifies thousands of putative enhancer regions in the Nem- 
atostella genome. 

In bilaterian model organisms, distal enhancers and promoters 
have different compositions of chromatin marks surrounding them: 
H3K4me3 is located at promoters and H3K4mel preferentially 
surrounds enhancers (Heintzman et al. 2007; The modENCODE 
Consortium et al. 2010; Bonn et al. 2012). We therefore asked 
whether the distribution of chromatin marks around such distal 
p300 peaks is different from the distribution of chromatin marks 
around TSSs in Nematostella embryos. To this end, we performed 
ChlP-seq with H3K4me3, H3K4me2, H3K4mel, H3K27ac, and 
H3K36me3 antibodies in Nematostella gastrulae and planula 
larvae. In addition, we performed RNA-seq to determine ex- 
pression levels of all genes and ChlP-seq using an antibody rec- 
ognizing the unphosphorylated C-terminal repeat of RNA poly- 
merase II (RNA Pol II), which recognized RNA Pol II bound at the 
TSSs of active genes in gastrulae (Supplemental Fig. S5) and planulae 
(Figs. 2, 3). 

To test whether this pattern is conserved in cnidarians, we 
plotted the distribution of histone modifications, RNA Pol II and 
p300, across distal p300 peaks aligned at their summits and genes 
aligned at their TSSs and transcription end sites (TESs) (Fig. 3; 
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Figure 1. Conserved distribution of chromatin marks across genes. (A) The distribution of H3K4me3 ChlP-seq reads (/-axis) is shown around genes of 
different species; human (The ENCODE Project Consortium 201 2), mouse (Xiao et al. 201 2), pig (Xiao et al. 201 2), fish (Bogdanovic et al. 201 2), fly (Bonn et al. 
2012), sea anemone (adult female polyps), and yeast (Maltby et al. 2012) aligned at their TSSs. TSSs were obtained from Ensembl Biomart, except for 
Nematostella, where they are based on RNA-seq data (D Fredman, M Schwaiger, F Rentzsch, and U Technau, in prep.). The x-axis spans -2 kb to +5 kb around 
TSSs. Genes with TSSs closer than 2 kb (yeast: 1 kb) to each other and shorter than 5 kb (yeast: 2 kb) were excluded from the analysis. (B) Heatmaps of histone 
modifications and CpG methylation (Zemach et al. 2010) in adult female Nematostella polyps around genes aligned at their TSSs, as in A. Each line of the 
heatmap represents a single gene (/-axis); only nonoverlapping genes longer than 5 kb were plotted. The colors indicate the number of reads on a log-scale 
(histone modifications) or the average percentage of CpG methylation. Note that many genes are transcribed in the opposite direction from nearby TSSs. 
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Figure 2. Many p300 peaks overlap with sites of open chromatin. Region surrounding the NvNcxl gene showing the distribution of p300 peaks (top, 
blue), gene models (black), p300 (blue), RNA Pol II (light green), several histone modifications (dark green), and input, (x-axis) Position on the scaffold; 
(/-axis) number of reads. The data are derived from planula larvae. 



Supplemental Fig. S5A). Indeed we found the distribution of 
chromatin marks to be highly similar to mammalian cells or Dro- 
sophila (Heintzman et al. 2007, 2009; The modENCODE Consor- 
tium et al. 2010; Negre et al. 201 1): H3K4me3 was located over TSSs 
of active genes, but hardly detected at distal p300 peaks, while 
H3K4mel was found at higher levels at distal p300 peaks (Figs. 3, 
4A,B; Supplemental Fig. S5A,B). H3K4me2 and H3K27ac were 
found both at promoters and distal p300 peaks (Fig. 3; Supple- 
mental Figs. S5A,B, S7A,B). RNA Pol II was mostly enriched at TSSs, 
but also found at distal p300 peaks (Fig. 3; Supplemental Fig. S5A). 
As expected, H3K36me3 was only located over transcribed gene 
bodies, which are also covered by RNA-seq reads (Fig. 3; Supple- 
mental Fig. S5A). Overall, active genes contained higher levels of 
p300 and active histone modifications at their TSSs and distal 
p300 peaks (Fig. 3; Supplemental Fig. S5A). This is in line with 
previously published analyses in bilaterians, where H3K4mel 
marks all enhancers, while active enhancers are distinguished 
from inactive ones by the presence of H3K27ac (Bonn and Fur- 
long 2008; Creyghton et al. 2010; Rada-Iglesias et al. 2010). Indeed 
we find H3K27ac peaks preferentially around active genes in the 
Nematostella as well as in the Drosophila genome (Supplemental 
Fig. S7D-G). Together, the distribution of chromatin marks across 
TSSs and distal p300 peaks suggests that not only promoter, but also 
enhancer modifications are highly conserved between bilaterians 
and cnidarians. To test whether this is specific to metazoans, we 
looked at the distribution of H3K4me3 and H3K4mel (Kirmizis 



et al. 2007) in the genome of Saccharomyces cerevisiae. While we find 
an enrichment of H3K4me3 compared with H3K4mel around TSSs 
in yeast, there is no enrichment of H3K4mel compared with 
H3K4me3 at TSS distal transcription factor peaks (Fig. 4C). Instead, 
H3K4mel is enriched downstream from TSSs within gene bodies 
(Liu et al. 2005). This suggests that distal enhancers as described in 
bilaterians exist also in Nematostella, but not in yeast. 

Prediction of gene regulatory elements in Nematostella 

In order to identify enhancer regions on a genome-wide level, we 
combined the distal p300 peaks and the information provided by 
the chromatin mark distributions to predict enhancers throughout 
the Nematostella genome. To predict chromatin states throughout 
the genome in an unbiased manner, we used ChromHMM (Ernst 
and Kellis 2012), a hidden Markov model-based classifier, to sep- 
arate the genome into six different chromatin states based on the 
five different histone modifications we assayed (Supplemental Fig. 
S6A,B). State 1 is defined by enrichment of H3K4me2, H3K4me3, 
and H3K27ac (Supplemental Fig. S6A), suggesting that it marks 
promoters. Indeed, we find it enriched over TSSs (Supplemental Fig. 
S6C,D). State 2 shares the same promoter-like modifications, but 
also contains H3K36me3 (Supplemental Fig. S6A), which suggests 
that it lies in gene bodies just downstream from the TSS (Supple- 
mental Fig. S6C,D). State 3 contains only H3K36me3 (Supplemental 
Fig. S6A) and is located over transcribed gene bodies (Supplemental 
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Figure 3. Enhancer-related chromatin modifications are associated with distal p300 peaks. Distribution of chromatin marks, RNA polymerase II, and 
p300 across distal p300 peaks and genes. Planula p300 peaks that do not overlap with TSSs were aligned relative to their peak summit (left plots), and 
genes were aligned relative to their annotated transcription start (middle plots) and end (right plots). The x-axis in each plot represents the position within 
the gene relative to peak summits, transcription start sites, and 3' ends. The /-axis in each plot represents the relative enrichment for epigenomic variables 
such as several histone modifications in the planula stage. (Red line) Nonexpressed genes (FPKM <1 .5). (Orange line) Lowly expressed genes. (Green line) 
Medium expressed genes. (Dark green line) Highly expressed genes. (Expressed genes) FPKM >2. The expressed genes were divided into three bins of an 
equal number of genes according to their FPKM values. 



Fig. S6C,D). State 4 contains all histone modifications except 
H3K36me3, and low levels of H3K4me3 (Supplemental Fig. S6A), 
suggesting that it marks mostly active enhancers. State 5 contains 
high levels of H3K4mel, low levels of H3K4me2, and lacks 
H3K4me3, a characteristic of enhancers (Supplemental Fig. S6A). 
State 6 covers the rest of the genome lacking any of the active 
modifications we profiled (Supplemental Fig. S6A). 



To predict enhancer elements, we selected only those distal 
p300 peaks, which overlapped chromatin states 4 or 5. This 
resulted in 2558 predicted enhancers in gastrulae and 4732 in 
planulae. Of those, 1543 were shared between both developmental 
stages. A similar overlap was found for p300 peaks between the two 
different developmental stages (Supplemental Fig. S7C). The ge- 
nomic location of all predicted enhancers and their distance to 
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Figure 4. H3K4me1 is enriched at distal p300 peaks in Nematostella. (A,B) Gastrula (A) or planula (B) p300 peaks were split into peaks >300 bp 
distal to transcription start sites (TSSs) (distal, left) and peaks within 300 bp around TSSs (TSS, right). On the /-axis, the enrichment of H3K4me1 (blue 
boxes) or H3K4me3 (gray boxes) normalized to input is plotted. Distal p300 peaks have higher H3K4me1 than H3K4me3 levels, a characteristic of 
enhancers. As expected, p300 peaks around TSSs are more enriched in H3K4me3. (C) Peaks of three different transcription factors (Rebl, Gal4, 
Phdl) derived from ChlP-exo experiments (Rhee and Pugh 201 1) in Saccharomyces cerevisiae were split into distal and TSS overlapping peaks as in 
A and B. A total of 70% of peaks overlapped TSSs, and the remaining peaks were still within 2 kb around the TSS. On the /-axis, the enrichment of 
H3K4me1 (blue boxes) or H3K4me3 (gray boxes) normalized to input (Kirmizis et al. 2007) is plotted. P-values were calculated using the Wilcoxon 
rank sum test. 



the closest TSS (on average 3898 bp) is provided in Supplemental 
Table S3. 

Interestingly, some predicted enhancers are occupied by 
RNA Pol II (Supplemental Fig. S9A). This has also been observed in 
bilaterians, where many enhancers are TSSs for diverse classes of 
lowly expressed noncoding RNAs (De Santa et al. 2010; Kim et al. 
2010; Wang et al. 2011; Chen et al. 2013). It may also indicate 
that enhancers are physically interacting with the transcription 
initiation complex at the TSS of their target gene. By looking for 
the TSSs located closest to each predicted enhancer, we identified 
2967 potential target genes. As expected, genes associated with 
gastrula-specific enhancers tend to be expressed at slightly but 
significantly higher levels in gastrulae compared with planulae, 
while planula-specific enhancers are located close to genes more 
highly transcribed in planulae than in gastrulae (Supplemental 
Fig. S9B). 

To investigate the dynamics of enhancer activity in Nem- 
atostella, we compared the levels of the two enhancer modifica- 
tions that we have profiled in three stages (gastrula, planula, and 
adult female polyp). Enrichments for H3K27ac and H3K4me2 at 
predicted enhancers were highly correlated between the gastrula 



and planula stages, but revealed a larger amount of enhancers with 
dynamic activity when compared with adult animals (Supple- 
mental Fig. S8). A total of 440 enhancers were acetylated in plan- 
ulae, and therefore likely active during early development, but 
inactive (significantly less acetylated) in adults (Supplemental Fig. 
S8C). Some of these enhancers were associated with genes in- 
volved in developmental transcriptional regulation (Supplemental 
Fig. S8G). On the other hand, 397 enhancers with significantly 
higher H3K27ac in adult polyps compared with planulae had 
a preference for association with neuronal genes (Supplemental 
Fig. S8C,G). 

Complex regulation of genes encoding transcription factors 
in Nematostella 

Having predicted thousands of gene regulatory elements through- 
out the Nematostella genome, we determined their distribution rel- 
ative to annotated genes. We found that gene ontology terms for 
transcriptional regulation, signaling, and developmental processes 
were significantly enriched among genes in the vicinity of predicted 
enhancers (Supplemental Fig. 10A). A total of 934 genes are po- 
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tentially regulated by multiple enhancer elements (Supplemental 
Fig. SI OB). Interestingly, genes encoding transcription factors are 
associated with multiple enhancers more often than housekeeping 
genes (Fig. 5 A). Despite apparently simple spatial and temporal 
expression patterns of most transcription factor genes in Nem- 
atostella (Steele et al. 201 1; Technau and Steele 201 1), this is similar 
to the situation in bilaterians, where many developmental regu- 
latory genes are regulated by a complex landscape of gene regula- 
tory elements (Bejerano et al. 2004; Woolfe et al. 2005; Hong et al. 
2008; Visel et al. 2008, 2009a; Heintzman et al. 2009; Arnold et al. 
2013). Indeed, we found that in Drosophila, a protostome model 
organism in which enhancers have been predicted in embryos 
genome wide (Negre et al. 2011), and to a slightly lesser extent in 
zebrafish, a vertebrate in which embryonic enhancers have re- 
cently been predicted (Bogdanovic et al. 2012), transcriptional 
regulatory genes tend to be associated with multiple enhancers 
(Supplemental Fig. 11). 



Genomic landscape of gene regulation 

In comparison with the Drosophila genome, the Nematostella ge- 
nome is about twice as large and contains an increased proportion 
of introns rather than intergenic regions (Putnam et al. 2007; 
Supplemental Fig. S12A,B). On the other hand, the zebrafish ge- 
nome is about five times larger than the Nematostella genome 
(Putnam et al. 2007; Howe et al. 2013) and contains 70% inter- 
genic sequence, where most predicted enhancers are located 
(Supplemental Fig. S12C). In contrast, in Nematostella and Dro- 
sophila, more enhancers are located in introns (Supplemental Fig. 
S12A,B). As expected, enhancers are depleted from the coding se- 
quence in Nematostella (Fig. 5B, P- value <0.001), just like in Dro- 
sophila (Fig. 5C, P-value <0.001) (Arnold et al. 2013) and zebrafish 
(Fig. 5D, P-value <0.001). However, only the large zebrafish ge- 
nome often lacks predicted enhancers within 1 kb upstream of 
transcription start sites (Fig. 5D; Supplemental Fig. S12C). All three 
species show an enrichment of regulatory sequences in the initial 
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Figure 5. Similar genomic distribution of predicted enhancers in different eumetazoans. (A) The number of genes associated with 1 , 2, 3, 4, and 5 
or more predicted enhancers in Nematostella is plotted for genes encoding transcription factors (pink) and housekeeping genes (gray). The counts of 
genes with a given number of predicted enhancers have been normalized to the counts of genes associated with a given number of shuffled 
predicted enhancers. (B-D) Distribution of predicted enhancer regions normalized to shuffled predicted enhancers across genomic annotations in 
Nematostella (B), Drosophila (C), and zebrafish (D). Positive numbers indicate enrichment, and negative numbers indicate depletion of predicted 
enhancers in a certain genomic region compared with the random expectation. Promoter regions are defined from the TSS to 1 kb upstream of 
theTSS. 
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introns of different genes (Fig. 5B-D, allP-values <0.001). Predicted 
enhancers are distributed at equal distances from TSSs in Nem- 
atostella and Drosophila, while they are distributed more broadly in 
zebrafish (Supplemental Fig. SI 2D). Together, our data suggest that 
the overall genomic distribution of enhancers is similar between 
Nematostella and Drosophila. Since the evolutionary distance be- 
tween Nematostella and bilaterian species precludes genome-wide 
alignment of their nucleotide sequences, we tested whether there 
was any trace of conservation between enhancers in orthologous 
regions by comparing the 6-mer DNA sequence content (Goke 
et al. 2012) between Nematostella and Drosophila or zebrafish-pre- 
dicted enhancers associated with gene orthologs, compared with 
predicted enhancers associated with random cross-species gene 
pairs. We could not detect any increased conservation, implying 
that these enhancers either do not share any common ancestry or 
have diverged beyond recognition by sequence similarity and even 
motif content (data not shown). 

In vivo validation of predicted enhancers 

To test the activity of distal enhancer elements we isolated a min- 
imal promoter of a Nematostella hsp70a gene, and confirmed that it 
does not induce the expression of a reporter gene on its own 
(Supplemental Table S4). We then tested 16 regions containing 
predicted enhancers for their ability to induce tissue-specific ex- 
pression of the reporter gene via the hsp70a or their endogenous 
promoter (Supplemental Table S4). We found that 12 of the tested 
regions drove expression in a mosaic and tissue-specific pattern at 
least partially reflecting the in situ hybridization expression pat- 
tern of the gene closest to the predicted enhancer (Fig. 6; Supple- 
mental Table S4). Four predicted enhancers did not drive any re- 
porter gene expression (Supplemental Table S4). The expression 
patterns of three genes were recovered with two independent en- 
hancer regions for each gene (Fig. 6G-I). Their expression domains 
appear to be at least partially overlapping, reminiscent of the re- 
dundant function of enhancers observed around some bilaterian 
genes (Hong et al. 2008). Reporter gene expression was mostly 
independent of the orientation or placement of the predicted en- 
hancer upstream of or downstream from the reporter gene (Sup- 
plemental Table S4). The only exception is the NvNcxl enhancer 
(Fig. 2), which drove expression only in reverse orientation (Sup- 
plemental Table S4). Interestingly, another predicted enhancer 
region >5 kb upstream of the nearest gene (Supplemental Fig. SI 3) 
drove expression specifically in neurons, with fluorescent neurites 
clearly visible (Supplemental Fig. S13). Taken together, at least 75% 
of Nematostella predicted enhancers likely represent true regulatory 
elements. 

In summary, our genome-wide analyses as well as the in vivo 
validated examples suggest that despite their rather simple ex- 
pression patterns and anatomy, Nematostella genes are subject to 
complex regulation by multiple enhancer elements. 

Discussion 

Gene regulatory elements are crucial for the regulation of gene 
expression in multicellular organisms (Levine 2010). Despite their 
importance for development, gene regulatory elements have so far 
only been studied in major bilaterian model organisms (Heintzman 
et al. 2007, 2009; Visel et al. 2009, 2013; The modENCODE Con- 
sortium et al. 2010; May et al. 2011; Negre et al. 2011; Bogdanovic 
et al. 2012; Shen et al. 2012; Chen et al. 2013; The ENCODE Project 
Consortium 2012; Neph et al. 2013). To obtain insights into the 
evolution of gene regulation in multicellular animals, we compared 



gene regulatory mechanisms in bilaterians and cnidarians, which 
split at least 600 million yr ago. Here we report the first compre- 
hensive map of gene regulatory elements as well as chromatin 
modifications in a nonbilaterian animal, the sea anemone Nem- 
atostella vectensis. We find a high degree of conservation in the 
genomic distribution of chromatin modifications and gene regu- 
latory elements between cnidarians and bilaterians. Promoter 
histone modifications are shared in most eukaryotes, and en- 
hancers are marked by the same histone modifications in Nem- 
atostella and bilaterian model organisms. While the function of 
these histone modifications at regulatory elements is still unclear 
(Calo and Wysocka 2013), it suggests conserved targeting mecha- 
nisms of the histone modifying enzymes to chromatin. 

Studies in mammalian cells have shown that H3K4me3 at 
CpG islands inhibits the methylation of CpGs (Ooi et al. 2007; 
Thomson et al. 2010). Since Drosophila melanogaster and Caenor- 
hahditis elegans, the most common invertebrate model organisms, 
lack significant amounts of DNA methylation, it was not clear 
whether this is specific to vertebrates or whether this feature has 
been lost in Drosophila and Caenorhabditis (Zemach and Zilberman 
2010). Comparisons between species showed that invertebrate 
DNA methylation is found at the same genes as active histone 
modifications in Drosophila (Nanty et al. 2011; Sarda et al. 2012; 
Hunt et al. 2013). Nematostella is the first invertebrate where both 
DNA methylation (Zemach et al. 2010) and histone modifications 
(this study) have been analyzed genome wide. We could show that 
in Nematostella, H3K4me3 and DNA methylation are found in 
mutually exclusive regions of the same genes. This suggests that 
the inhibition of DNA methylation by H3K4me3 might be occur- 
ring not only in mammals but also in cnidarians. Detailed mech- 
anistic studies of the targeting of DNA methylation in Nem- 
atostella, as well as in protostomes containing DNA methylation, 
will be required to determine whether an inhibitory regulation of 
DNA methylation by H3K4me3 is ancestral to eumetazoans. Since 
DNA methylation and active histone modifications are found at 
the same genes in Nematostella, we suspect that they share a com- 
mon targeting mechanism, most likely transcriptional activity. 

The analysis of the Nematostella genome revealed that it is 
more similar to vertebrate genomes in terms of gene content and 
genomic organization (exon-intron structures, broad gene syn- 
teny) than to the genomes of common invertebrate model or- 
ganisms (Putnam et al. 2007). This suggested that bilaterians, 
rather than acquiring many novel genes, mostly evolved more 
elaborate mechanisms to regulate gene expression. This might be 
the case for post-transcriptional gene regulation mediated by 
microRNAs, which we found to differ substantially in cnidarians 
and bilaterians (Moran et al. 2014). Thus, we expected also the 
gene regulatory landscape of Nematostella to be less complex than 
that of the investigated bilaterian model species. 

Surprisingly, our findings do not support this view. Instead, 
careful analysis indicates that despite lacking CTCF (Heger et al. 
2012), a factor implicated in gene regulation by distal enhancers 
(Merkenschlager and Odom 2013), the gene regulatory landscape 
of Nematostella shows a similar complexity as that of bilaterians. In 
all animals studied so far, genes encoding tissue-specific tran- 
scription factors are regulated by a high number of enhancers 
(Bejerano et al. 2004; Woolfe et al. 2005; Hong et al. 2008; Visel 
et al. 2008; Heintzman et al. 2009; Visel et al. 2009a). Furthermore, 
predicted enhancers are enriched in the same genomic regions and 
at a similar distance from transcription start sites in Nematostella 
and Drosophila. While noncoding transcription at enhancers will 
need to be investigated in detail, the enrichment of RNA Pol II at 
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Figure 6. Predicted enhancer elements activate transcription in vivo. (A-l) Whole-mount in situ hybridizations of Nematostella embryos and primary 
polyps. (A'-l') Fluorescent mOrange2 signal of live embryos or primary polyps injected with a construct where a predicted enhancer region of the 
indicated Nematostella gene was driving mOrange2 expression. (G"-l") Fluorescent mOrange2 signal of live primary polyps injected with a construct 
where a second predicted enhancer region (different from the region in C'-/') of the indicated Nematostella gene was driving mOrange2 expression. The 
Nematostella gene names are indicated inside the in situ hybridization pictures (Nv in the beginning of the gene name was omitted due to space 
constraints). The white scale bars represent 1 00 u.m. All pictures were taken at the primary polyp stage (>8 d post-fertilization; lateral view) except D and E, 
which depict planula larvae (lateral view: D,D'; oral view: £,£'). (J,K) Schematic representation of a Nematostella planula larva (J) and primary polyp (K). 



predicted enhancers already suggests that some enhancers in 
Nematostella could transcribe noncoding RNAs, as is the case in 
bilaterians (De Santa et al. 2010; Kim et al. 2010; Wang et al. 2011; 
Chen et al. 2013). One example could be the NvNcxl enhancer, 
which activates transcription in an orientation-dependent manner 
and which is bound by low levels of RNA Pol II. This could reflect 
a common mechanism of transcriptional activation by these en- 
hancers. It is possible that this mechanism does not involve 
chromatin looping through CTCF, but read-through low-level 
transcription from the enhancer to the downstream promoter. 
However, most enhancers we tested in vivo also activated tran- 
scription when placed downstream from the hsp70 basal promoter 
and the reporter gene, suggesting that this is not the only mech- 
anism of activation in Nematostella. Instead, just like in bilaterians, 



most Nematostella distal enhancers can activate transcription from 
a promoter independently of the position and orientation relative 
to that promoter. This suggests that chromatin looping of en- 
hancers to their target promoters in Nematostella occurs through 
CTCF-independent cohesin binding, which plays a role in en- 
hancer looping in bilaterians (Pauli et al. 2008; Schuldiner et al. 
2008; Phillips and Corces 2009; Kagey et al. 2010; Schmidt et al. 
2010; Seitan et al. 2011; Faure et al. 2012; Merkenschlager and 
Odom 2013). The enrichment of RNA Pol II at enhancers could also 
be an indication for physical interaction of enhancers and their 
target promoters in Nematostella. In the future, detailed mecha- 
nistic studies of chromatin looping and noncoding RNA function 
in Nematostella will identify the factors required for enhancer 
promoter interaction in this organism. In addition, examining 
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histone modifications associated with inactive or repressed chro- 
matin will give a more complete picture of the cnidarian gene 
regulatory landscape. 

Taken together, we provide the first genome-wide analysis of 
gene regulation in a nonbilaterian animal and thereby shed light 
on the evolution of morphological complexity in metazoans. Our 
results suggest that a complex gene regulatory architecture as well 
as epigenomic chromatin signatures predate the divergence of 
bilaterians and cnidarians from their last common ancestor. 

Methods 

Antibodies 

We used a 146-aa peptide of the C terminus of Nematostella p300 
protein, including a N-terminal 10 X His tag for immunization of 
two rabbits each. p300 antisera and purified p300 antibodies were 
generated by Primm (www.primmbiotech.com). We used a mouse 
monoclonal antibody against the unphosphorylated C-terminal 
repeat of RNA polymerase II (8WG16, kind gift from C. Wirbelauer), 
as well as polyclonal antibodies against H3K4me3 (Diagenode, 
pAb-003-050), H3K4me2 (Diagenode, pAb-035-050), H3K4mel 
(Abeam, ab8895), H3K36me3 (Diagenode, pAb-058-050), and 
H3K27ac (Abeam, ab4729). The entire N-terminal tail of histone 
H3 is perfectly conserved between Nematostella and model organ- 
isms where these antibodies have been used successfully. 

Chromatin immunoprecipitation 

Fertilized eggs from —500 female Nematostella polyps were grown 
to the mid-gastrula or planula stage as described (Fritzenwanker 
and Technau 2002). After removal from jelly, eggs were fixed in 2% 
formaldehyde for 12 min at room temperature. Cross-linking was 
stopped using glycine, and embryos were homogenized in nuclei 
buffer 1 (50 mM HEPES at pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% 
Glycerol, 0.5% NP40, 0.25% Triton X-100, 1 mM DTT) and 
centrifuged at 500# for 3 min. After another centrifugation of the 
supernatant at 2000# for 10 min, the pelleted nuclei were washed 
with nuclei buffer 2 (50 mM HEPES at pH 7.5, 140 mM NaCl, 1 mM 
EDTA, 10% Glycerol, 0.5% NP40, 0.25% Triton X-100) and finally 
resuspended in Lysis buffer (50 mM HEPES at pH 7.5, 500 mM 
NaCl, 1 mM EDTA, 0.1% DOC, 0.1% Triton X-100, 0.1% SDS). 
After shearing the chromatin to 100-200 bp using a Covaris in- 
strument and removing the nonsoluble material, we blocked the 
chromatin with preblocked (1 mg/mL tRNA, 1 mg/mL BSA) pro- 
tein A beads CL-4B (GE Healthcare 71-7090-00 AE). At this point, 
we removed an aliquot of chromatin for input DNA and incubated 
the remaining chromatin with the antibodies, rotating at 4°C 
overnight. We then added preblocked protein A beads and washed 
the beads twice with Lysis buffer, twice with DOC buffer (10 mM 
Tris at pH 8, 0.25 M LiCl, 0.5% NP-40, 0.5% DOC, 1 mM EDTA), 
and once with TE buffer. Chromatin was eluted from the beads 
twice with elution buffer (1% SDS, 0.1 M NaHC03) and de-cross- 
linked. Libraries of immunoprecipitated DNA or input DNA were 
prepared according to Illumina's instructions (catalog number IP- 
102-1001) and deep sequencing was performed at the CSF NGS 
Unit (http://csf.ac.at/) with 36 bp GA II or 50 bp HiSeq reads. 

RNA-seq 

RNA of Nematostella embryos was isolated using TRIzol followed by 
DNase I treatment, at the same developmental time points as de- 
scribed for ChlP. For two planula and one gastrula sample, poly(A) 
RNA was isolated and libraries were prepared according to Illumi- 
na's instructions. The second gastrula sample was subjected to ri- 



bosomal RNA depletion (RiboMinus) and strand-specific library 
preparation using the dUTP method (Levin et al. 2010). Deep se- 
quencing by Illumina HiSeq and GA IIx was performed at the CSF 
NGS Unit with 76-bp paired end reads. 

Data analysis 

ChlP-seq reads were mapped to the genome using BWA (Li and 
Durbin 2009) with default settings and those with a mapping 
quality score <30 were discarded. We used the following genome 
versions: Nematostella: Nemvecl; Drosophila: dm3; zebrafish: zv8; 
pig: SusScr2; mouse: mm9; human: hgl9; yeast: SacCer3. Further 
analysis was performed using the BamTools and BEDTools suites 
(Quinlan and Hall 2010; Barnett et al. 2011) as well as custom perl 
and R (R Development Core Team 2012) scripts. K-means cluster- 
ing was performed using Cluster 3.0 (Eisen et al. 1998). Chromatin 
states were predicted across the genome using ChromHMM (Ernst 
and Kellis 2012). p300 peaks were detected using the peakzilla 
software (Bardet et al. 2013). See Supplemental Methods for details. 

In vivo analysis of regulatory elements 

Predicted gene regulatory elements were tested for their activity to 
induce expression of mOrange2 in developing Nematostella em- 
bryos and polyps as described (Renfer et al. 2010). In situ hybrid- 
izations on Nematostella embryos and primary polyps were per- 
formed as described (Genikhovich and Technau 2009). 

Data access 

All data sets have been submitted to the NCBI Gene Expression 
Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo) (Edgar et al. 
2002) under accession number GSE46488. 
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